深度估计是需要对环境的3D评估的广大应用程序的基石,例如机器人,增强现实和自主驱动来命名几个。深度估计的一个突出技术是立体声匹配,其具有多种优点:它被认为比其他深度传感技术更容易进入,可以实时产生密集的深度估计,并从近年来深度学习的进步中受益匪浅。然而,用于立体图像的深度估计的当前技术仍然遭受内置缺点。为了重建深度,立体声匹配算法首先在应用几何三角测量之前估计左图像和右图像之间的视差图。一个简单的分析表明,深度误差与对象距离相当成比例。因此,恒定的差异误差被转换为远离相机的物体的大深度误差。为了缓解这种二次关系,我们提出了一种简单但有效的方法,使用细化网络进行深度估计。我们展示了分析和经验结果表明所提出的学习程序减少了这种二次关系。我们评估了众所周知的基准和数据集的提出的细化程序,如演唱者和基提数据集,并在深度精度度量中展示了显着的改进。
translated by 谷歌翻译
Attribute-controlled text rewriting, also known as text style-transfer, has a crucial role in regulating attributes and biases of textual training data and a machine generated text. In this work we present SimpleStyle, a minimalist yet effective approach for style-transfer composed of two simple ingredients: controlled denoising and output filtering. Despite the simplicity of our approach, which can be succinctly described with a few lines of code, it is competitive with previous state-of-the-art methods both in automatic and in human evaluation. To demonstrate the adaptability and practical value of our system beyond academic data, we apply SimpleStyle to transfer a wide range of text attributes appearing in real-world textual data from social networks. Additionally, we introduce a novel "soft noising" technique that further improves the performance of our system. We also show that teaching a student model to generate the output of SimpleStyle can result in a system that performs style transfer of equivalent quality with only a single greedy-decoded sample. Finally, we suggest our method as a remedy for the fundamental incompatible baseline issue that holds progress in the field. We offer our protocol as a simple yet strong baseline for works that wish to make incremental advancements in the field of attribute controlled text rewriting.
translated by 谷歌翻译
We present a neural technique for learning to select a local sub-region around a point which can be used for mesh parameterization. The motivation for our framework is driven by interactive workflows used for decaling, texturing, or painting on surfaces. Our key idea is to incorporate segmentation probabilities as weights of a classical parameterization method, implemented as a novel differentiable parameterization layer within a neural network framework. We train a segmentation network to select 3D regions that are parameterized into 2D and penalized by the resulting distortion, giving rise to segmentations which are distortion-aware. Following training, a user can use our system to interactively select a point on the mesh and obtain a large, meaningful region around the selection which induces a low-distortion parameterization. Our code and project page are currently available.
translated by 谷歌翻译
Pretraining has been shown to scale well with compute, data size and data diversity. Multitask learning trains on a mixture of supervised datasets and produces improved performance compared to self-supervised pretraining. Until now, massively multitask learning required simultaneous access to all datasets in the mixture and heavy compute resources that are only available to well-resourced teams. In this paper, we propose ColD Fusion, a method that provides the benefits of multitask learning but leverages distributed computation and requires limited communication and no sharing of data. Consequentially, ColD Fusion can create a synergistic loop, where finetuned models can be recycled to continually improve the pretrained model they are based on. We show that ColD Fusion yields comparable benefits to multitask pretraining by producing a model that (a) attains strong performance on all of the datasets it was multitask trained on and (b) is a better starting point for finetuning on unseen datasets. We find ColD Fusion outperforms RoBERTa and even previous multitask models. Specifically, when training and testing on 35 diverse datasets, ColD Fusion-based model outperforms RoBERTa by 2.45 points in average without any changes to the architecture.
translated by 谷歌翻译
Graph neural networks (GNNs) are widely used for modeling complex interactions between entities represented as vertices of a graph. Despite recent efforts to theoretically analyze the expressive power of GNNs, a formal characterization of their ability to model interactions is lacking. The current paper aims to address this gap. Formalizing strength of interactions through an established measure known as separation rank, we quantify the ability of certain GNNs to model interaction between a given subset of vertices and its complement, i.e. between sides of a given partition of input vertices. Our results reveal that the ability to model interaction is primarily determined by the partition's walk index -- a graph-theoretical characteristic that we define by the number of walks originating from the boundary of the partition. Experiments with common GNN architectures corroborate this finding. As a practical application of our theory, we design an edge sparsification algorithm named Walk Index Sparsification (WIS), which preserves the ability of a GNN to model interactions when input edges are removed. WIS is simple, computationally efficient, and markedly outperforms alternative methods in terms of induced prediction accuracy. More broadly, it showcases the potential of improving GNNs by theoretically analyzing the interactions they can model.
translated by 谷歌翻译
We present nBIIG, a neural Business Intelligence (BI) Insights Generation system. Given a table, our system applies various analyses to create corresponding RDF representations, and then uses a neural model to generate fluent textual insights out of these representations. The generated insights can be used by an analyst, via a human-in-the-loop paradigm, to enhance the task of creating compelling table reports. The underlying generative neural model is trained over large and carefully distilled data, curated from multiple BI domains. Thus, the system can generate faithful and fluent insights over open-domain tables, making it practical and useful.
translated by 谷歌翻译
Previous studies observed that finetuned models may be better base models than the vanilla pretrained model. Such a model, finetuned on some source dataset, may provide a better starting point for a new finetuning process on a desired target dataset. Here, we perform a systematic analysis of this intertraining scheme, over a wide range of English classification tasks. Surprisingly, our analysis suggests that the potential intertraining gain can be analyzed independently for the target dataset under consideration, and for a base model being considered as a starting point. This is in contrast to current perception that the alignment between the target dataset and the source dataset used to generate the base model is a major factor in determining intertraining success. We analyze different aspects that contribute to each. Furthermore, we leverage our analysis to propose a practical and efficient approach to determine if and how to select a base model in real-world settings. Last, we release an updating ranking of best models in the HuggingFace hub per architecture https://ibm.github.io/model-recycling/.
translated by 谷歌翻译
从有限的资源中获得最大收益可以进步自然语言处理(NLP)研究和实践,同时保守资源。这些资源可能是数据,时间,存储或能源。NLP的最新工作从缩放率产生了有趣的结果。但是,仅使用比例来改善结果意味着资源消耗也会扩展。这种关系激发了对有效方法的研究,这些方法需要更少的资源才能获得相似的结果。这项调查涉及NLP效率的方法和发现,旨在指导该领域的新研究人员并激发新方法的发展。
translated by 谷歌翻译
最近,人们对基于变压器的模型产生有意义的文本嵌入的能力越来越兴趣,例如文本相似性。尽管该领域取得了重大进展,但相似性预测的解释仍然具有挑战性,尤其是在无监督的环境中。在这项工作中,我们提出了一种无监督的技术,用于解释预先训练的BERT模型推断出的段落相似性。通过查看一对段落,我们的技术确定了决定每个段落的语义的重要单词,在这两个段落中的单词之间匹配,并检索解释两者之间相似性的最重要对。该方法已通过广泛的人类评估进行了评估,并在包含长期复杂段落的数据集中证明了这一方法,已显示出巨大的希望,提供了与人类看法更好相关的准确解释。
translated by 谷歌翻译
我们提出了Metricbert,这是一个基于BERT的模型,该模型学会了以明确的相似性度量嵌入文本,同时遵守``传统''蒙面语言任务。我们专注于学习相似之处的下游任务,以表明公制表现优于最先进的替代方案,有时要大幅度。我们对我们的方法及其不同的变体进行了广泛的评估,这表明我们的训练目标对传统的对比损失,标准余弦相似性目标和其他六个基线非常有益。作为另一个贡献,我们发布了视频游戏描述的数据集,以及由域专家制作的一系列相似性注释。
translated by 谷歌翻译